Sets Represented as the Length-n Factors of a Word

Authors

  • Shuo Tan
  • Jeffrey Shallit
Abstract

In this paper we consider the following problems: How many different subsets of $\Sigma^n$ can occur as the set of all length-n factors of a finite word? If a subset is representable, how long a word do we need to represent it? How many such subsets are represented by words of length t? For the first problem, we give upper and lower bounds of the form $\alpha^{2^n}$ in the binary case. For the second problem, we give a weak upper bound and some experimental data. For the third problem, we give a closed-form formula in the case where n ≤ t < 2n. Algorithmic variants of these problems have previously been studied under the name “shortest common superstring”.
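
As an illustration of the definitions in the abstract, the following is a minimal Python sketch, not taken from the paper. It computes the set of length-n factors of a word and enumerates, by brute force, the distinct factor sets produced by all words of a given length t. The function names `factors` and `represented_by_length` and the default binary alphabet are assumptions made for this example, and the enumeration is only practical for very small n and t.

```python
from itertools import product


def factors(word: str, n: int) -> frozenset:
    """Return the set of all length-n factors (contiguous substrings) of word."""
    return frozenset(word[i:i + n] for i in range(len(word) - n + 1))


def represented_by_length(t: int, n: int, alphabet: str = "01") -> set:
    """Return the distinct length-n factor sets of all words of length exactly t.

    Brute force over len(alphabet)**t words, so keep t small.
    """
    return {factors("".join(w), n) for w in product(alphabet, repeat=t)}


if __name__ == "__main__":
    print(sorted(factors("0110", 2)))
    print(len(represented_by_length(3, 2)))
```

For instance, the word 0110 represents the subset {01, 11, 10} of the binary words of length 2, and the words 010 and 101 represent the same subset {01, 10}.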


Similar resources

Modeling the Phonological Recognition of Persian Words

A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...


General definitions for the union and intersection of ordered fuzzy multisets

Since its original formulation, the theory of fuzzy sets has spawned a number of extensions where the role of membership values in the real unit interval $[0, 1]$ is handed over to more complex mathematical entities. Amongst the many existing extensions, two similar ones, the fuzzy multisets and the hesitant fuzzy sets, rely on collections of several distinct values to represent fuzzy membershi...


The (non-)existence of perfect codes in Lucas cubes

A Fibonacci string of length $n$ is a binary string $b = b_1b_2\ldots b_n$ in which for every $1 \leq i < n$, $b_i\cdot b_{i+1} = 0$. In other words, a Fibonacci string is a binary string without 11 as a substring. Similarly, a Lucas string is a Fibonacci string $b_1b_2\ldots b_n$ such that $b_1\cdot b_n = 0$. For a natural number $n\geq1$, a Fibonacci cube of dimension $n$ is denoted by $\Gamma_n$ and i...


Counting Lyndon Factors

In this paper, we determine the maximum number of distinct Lyndon factors that a word of length n can contain. We also derive formulas for the expected total number of Lyndon factors in a word of length n on an alphabet of size σ, as well as the expected number of distinct Lyndon factors in such a word. The minimum number of distinct Lyndon factors in a word of length n is 1 and the minimum tot...


Intellectual structure of knowledge in Nanomedicine field (2009 to 2018): A Co-Word Analysis

Introduction: The Co-word analysis has the ability to identify the intellectual structure of knowledge in a research domain and reveal its subsurface research aspects. Objective: This study examines the intellectual structure of knowledge in the field of nanomedicine during the period of 2009 to 2018 by using Co-word analysis. Materials and Methods: This paper develops a sciento...




Publication date: 2013